Facing issue building a simple RAG application using RetrievalQA

Shaleensr · May 30, 2025, 5:54am

Hi all, here is the code for a simple RAG app that I am having trouble running. The issue probably lies in the repo_id passed to HuggingFaceEndpoint. Could you please tell me which repo_id should be used here? Also, how do I work out which repo_id to use in a given case?

Here is the code:

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_huggingface import HuggingFaceEmbeddings, HuggingFaceEndpoint
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from dotenv import load_dotenv

load_dotenv()

# Loading a text file
loader = TextLoader("./Dummy_docs/Test_doc_1.txt")
documents = loader.load()

# Splitting the text into manageable chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

# Converting document chunks into embeddings and storing them in FAISS
embedding_model = HuggingFaceEmbeddings()
vectorstore = FAISS.from_documents(chunks, embedding_model)

# Creating a retriever from the vectorstore
retriever = vectorstore.as_retriever()

# Initializing the LLM endpoint
llm = HuggingFaceEndpoint(
    repo_id="tiiuae/falcon-7b-instruct",
    provider="auto"
)

# Create the RetrievalQA chain
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

# Define your user query
query = "Can I cancel my e-ticket booked online from Busways, at the counter?"
print(f"\nQuery: {query}\n")

# Get the answer
answer = qa_chain.run(query)
print("Answer:")
print(answer)

I am a beginner in GenAI. Thank you for your support.

John6666 · May 30, 2025, 9:47am

The format of repo_id is correct, but the model has not been deployed for inference, so it cannot be called through the API. Refer to the following post to find a model that can be used. Using a local model is more reliable, but requires a GPU.
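For the local-model route mentioned above, here is a minimal sketch, assuming the `transformers` and `torch` packages are installed alongside `langchain_huggingface` and that the machine has enough GPU memory for the chosen model. The helper name `build_local_llm` and the `max_new_tokens` default are illustrative choices, not something from this thread.

```python
def build_local_llm(model_id: str, max_new_tokens: int = 256):
    """Load a Hub model on this machine and wrap it as a LangChain LLM.

    Hypothetical helper: it runs the model locally instead of calling a
    hosted inference endpoint, so it does not depend on the model being
    deployed by any provider.
    """
    # Imported lazily so that defining the helper does not require the
    # (large) local-inference dependencies to be installed.
    from langchain_huggingface import HuggingFacePipeline

    return HuggingFacePipeline.from_model_id(
        model_id=model_id,
        task="text-generation",
        pipeline_kwargs={"max_new_tokens": max_new_tokens},
    )


# Usage (downloads the weights on first run, so it is left commented out):
# llm = build_local_llm("tiiuae/falcon-7b-instruct")
# qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
```

The returned object can be passed as `llm` to `RetrievalQA.from_chain_type` in place of `HuggingFaceEndpoint`; the rest of the pipeline stays the same.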

Shaleensr · May 30, 2025, 11:59am

Got you! I checked the supported inference endpoints, and it worked using this:

# Initializing the LLM endpoint
llm = HuggingFaceEndpoint(
    repo_id="HuggingFaceH4/zephyr-7b-beta"
)

Thanks again!
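On the question of how to tell which repo_id is usable: one rough way is to send a tiny probe prompt to each candidate before building the full chain, and keep the first one that answers. The sketch below assumes only that the LLM object exposes LangChain's `invoke` method; the helper name `first_working_repo_id` and the probe text are hypothetical. Which models are actually served changes over time, so a repo_id that works today may not work later.

```python
from typing import Callable, Iterable, Optional


def first_working_repo_id(
    repo_ids: Iterable[str],
    make_llm: Callable[[str], object],
    probe: str = "Reply with one word: hello",
) -> Optional[str]:
    """Return the first repo_id whose LLM answers a short probe prompt.

    make_llm builds an LLM for a given repo_id. Any exception raised while
    building or calling it (for example, because the model is not deployed)
    is reported and that candidate is skipped.
    """
    for repo_id in repo_ids:
        try:
            make_llm(repo_id).invoke(probe)
        except Exception as exc:
            print(f"{repo_id}: not usable ({type(exc).__name__}: {exc})")
            continue
        return repo_id
    return None


# Usage with the two repo_ids tried in this thread (makes network calls,
# so it is left commented out):
# from langchain_huggingface import HuggingFaceEndpoint
# repo_id = first_working_repo_id(
#     ["tiiuae/falcon-7b-instruct", "HuggingFaceH4/zephyr-7b-beta"],
#     lambda r: HuggingFaceEndpoint(repo_id=r),
# )
```

Failing early on a one-line probe makes it clear that the problem is the endpoint rather than the loader, splitter, or retriever.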
